Dynamic Branch Decoupled Architecture

نویسندگان

  • Akhilesh Tyagi
  • Hon-Chi Ng
  • Prasant Mohapatra
چکیده

We propose an alternative approach to branch resolution based on the earlier work on decoupled memory architectures. Branch decoupling is a technique to decouple a single instruction stream program into two streams. One stream is solely dedicated to resolving branches as early as possible (both the branch condition and the branch target). The resolved branch targets are consumed by the other computing stream through a queue. We have proposed a compiler based, static branch decoupling methodology earlier. In this paper, we propose a dynamic branch decoupled (DBD) architecture. Simulations show a speedup of for SPEC95 integer benchmarks and for SPEC95 FP benchmarks over a 2-level adaptive branch predictor. The average number of branch penalty cycles per instruction for DBD reduces to compared to for the 2-level

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Decoupled Fetch-Execute Engine with Static Branch Prediction Support

We describe a method for supporting static branch prediction on a decoupled fetch-execute pipeline. Using instruction buffers to decouple instruction fetch from the execute pipeline is an effective way to minimize instruction cache penalties by allowing instruction fetch and stall miss handling to proceed independent of the execution pipeline. Dynamic branch prediction is typically used with su...

متن کامل

An Integrated Partitioning and Scheduling Based Branch Decoupling

Conditional branch induced control hazards cause significant performance loss in modern out-of-order superscalar processors. Dynamic branch prediction techniques help alleviate the penalties associated with conditional branch instructions. However, branches still constitute one of the main hurdles towards achieving higher ILP. Dynamic branch prediction relies on the temporal locality of and spa...

متن کامل

Tolerating Branch Predictor Latency on SMT

Simultaneous Multithreading (SMT) tolerates latency by executing instructions from multiple threads. If a thread is stalled, resources can be used by other threads. However, fetch stall conditions caused by multi-cycle branch predictors prevent SMT to achieve all its potential performance, since the flow of fetched instructions is halted. This paper proposes and evaluates solutions to deal with...

متن کامل

Survey the Security Function of Integration of vehicular ad hoc Networks with Software-defiend Networks

In recent years, Vehicular Ad Hoc Networks (VANETs) have emerged as one of the most active areas in the field of technology to provide a wide range of services, including road safety, passenger's safety, amusement facilities for passengers and emergency facilities. Due to the lack of flexibility, complexity and high dynamic network topology, the development and management of current Vehicular A...

متن کامل

Characterization of embedded applications for decoupled processor architecture

Needs for performance on embedded applications will lead to the use of dynamic execution on embedded processors in the next few years. However, complete out-of-order superscalar cores are still expensive in terms of silicon area and power dissipation. In this paper, we study the adequation of a more limited form of dynamic execution, namely decoupled architecture, to embedded applications. Deco...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999